Kolmogorov's Structure Functions and Model Selection
In 1974 Kolmogorov proposed a non-probabilistic approach to statistics and
model selection. Let data be finite binary strings and models be finite sets of
binary strings. Consider model classes consisting of models of given maximal
(Kolmogorov) complexity. The ``structure function'' of the given data expresses
the relation between the complexity level constraint on a model class and the
least log-cardinality of a model in the class containing the data. We show that
the structure function determines all stochastic properties of the data: for
every constrained model class it determines the individual best-fitting model
in the class irrespective of whether the ``true'' model is in the model class
considered or not. In this setting, this happens {\em with certainty}, rather
than with high probability as in the classical case. We precisely quantify
the goodness-of-fit of an individual model with respect to individual data. We
show that--within the obvious constraints--every graph is realized by the
structure function of some data. We determine the (un)computability properties
of the various functions contemplated and of the ``algorithmic minimal
sufficient statistic.''
Comment: 25 pages LaTeX, 5 figures. In part in Proc. 47th IEEE FOCS; this final version (more explanations, cosmetic modifications) to appear in IEEE Trans. Inform. Theory.
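For reference, the structure function referred to above is usually written as follows (a standard formulation; the abstract itself does not display it):

```latex
% Kolmogorov's structure function of a string x:
% the least log-cardinality of a finite set S containing x,
% minimized over models S of complexity at most \alpha.
h_x(\alpha) \;=\; \min_{S}\bigl\{\, \log |S| \;:\; x \in S,\; K(S) \le \alpha \,\bigr\}
```

The best-fitting model at complexity level \alpha is then a witness of this minimum.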
Limit complexities revisited [once more]
The main goal of this article is to put some known results in a common
perspective and to simplify their proofs.
We start with a simple proof of a result of Vereshchagin saying that lim sup_n C(x|n) equals C^{0'}(x) up to an additive O(1) term, where C(x|n) is the plain conditional complexity of x given n and C^{0'} is plain complexity relativized to the halting problem. Then we use the same argument to prove similar results for prefix complexity and a priori probability on a binary tree, to prove Conidis' theorem about limits of effectively open sets, and also to improve Muchnik's results about limit frequencies. As a by-product, we get a criterion of 2-randomness proved by Miller: a sequence ω is 2-random if and only if there exists c such that any prefix x of ω is a prefix of some string y such that C(y) ≥ |y| − c. (In the 1960s this property was suggested by Kolmogorov as one of the possible randomness definitions.) We also get another 2-randomness criterion by Miller and Nies: ω is 2-random if and only if C(x) ≥ |x| − c for some c and infinitely many prefixes x of ω.
This is a modified version of our old paper, which contained a weaker (and more cumbersome) version of Conidis' result, whose proof used the low basis theorem (in quite a strange way). The full version was formulated there as a conjecture. This conjecture was later proved by Conidis. Bruno Bauwens (personal communication) noted that the proof can also be obtained by a simple modification of our original argument, and we reproduce Bauwens' argument with his permission.
Comment: See http://arxiv.org/abs/0802.2833 for the old paper.
Test Martingales, Bayes Factors and p-Values
A nonnegative martingale with initial value equal to one measures evidence
against a probabilistic hypothesis. The inverse of its value at some stopping
time can be interpreted as a Bayes factor. If we exaggerate the evidence by
considering the largest value attained so far by such a martingale, the
exaggeration will be limited, and there are systematic ways to eliminate it.
The inverse of the exaggerated value at some stopping time can be interpreted
as a p-value. We give a simple characterization of all increasing functions
that eliminate the exaggeration.
Comment: Published at http://dx.doi.org/10.1214/10-STS347 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Independent minimum length programs to translate between given strings
A string p is called a program to compute y given x if U(p,x)=y, where U denotes a universal programming language. The Kolmogorov complexity K(y|x) of y relative to x is defined as the minimum length of a program to compute y given x. Let K(x) denote K(x|empty string) (the Kolmogorov complexity of x) and let I(x:y)=K(x)+K(y)−K(⟨x,y⟩) (the amount of mutual information in x,y). In the present paper, we answer in the negative the following question posed in Bennett et al., IEEE Trans. Inform. Theory 44 (4) (1998) 1407–1423: is it true that for any strings x,y there are independent minimum-length programs p,q to translate between x,y? That is, is it true that for any x,y there are p,q such that U(p,x)=y, U(q,y)=x, the length of p is K(y|x), the length of q is K(x|y), and I(p:q)=0 (where the last three equalities hold up to an additive O(log(K(x|y)+K(y|x))) term)?
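The definitions above are tied together by the symmetry-of-information theorem, stated here for orientation (a standard fact of Kolmogorov complexity, not part of the abstract):

```latex
% Symmetry of information (Kolmogorov--Levin), up to logarithmic terms:
K(\langle x,y\rangle) \;=\; K(x) + K(y \mid x) + O\bigl(\log K(\langle x,y\rangle)\bigr),
% hence the mutual information can equivalently be written
I(x:y) \;=\; K(x) - K(x \mid y) + O\bigl(\log K(\langle x,y\rangle)\bigr).
```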
Probability-free pricing of adjusted American lookbacks
Consider an American option that pays G(X^*_t) when exercised at time t,
where G is a positive increasing function, X^*_t := \sup_{s\le t}X_s, and X_s
is the price of the underlying security at time s. Assuming zero interest
rates, we show that the seller of this option can hedge his position by trading
in the underlying security if he begins with initial capital
X_0\int_{X_0}^{\infty}G(x)x^{-2}dx (and this is the smallest initial capital
that allows him to hedge his position). This leads to strategies for trading
that are always competitive both with a given strategy's current performance
and, to a somewhat lesser degree, with its best performance so far. It also
leads to methods of statistical testing that avoid sacrificing too much of the
maximum statistical significance that they achieve in the course of
accumulating data.
Comment: 28 pages, 1 figure
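As a quick sanity check of the initial-capital formula (our own numerical illustration, with an assumed payoff G(x) = sqrt(x) and X_0 = 1, for which the integral has the closed form 2*sqrt(X_0)):

```python
import math

def initial_capital(G, x0, upper=1e6, n=20_000):
    """Approximate x0 * integral_{x0}^{upper} G(x) * x**-2 dx
    by the trapezoid rule on a log-spaced grid (upper limit truncated)."""
    xs = [x0 * (upper / x0) ** (i / n) for i in range(n + 1)]
    f = [G(x) / x**2 for x in xs]
    integral = sum((xs[i + 1] - xs[i]) * (f[i] + f[i + 1]) / 2 for i in range(n))
    return x0 * integral

# For G(x) = sqrt(x): x0 * integral_{x0}^inf x**-1.5 dx = 2 * sqrt(x0)
capital = initial_capital(math.sqrt, 1.0)
print(capital)  # close to 2.0 (truncating at 1e6 loses about 0.002)
```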
Randomized Boolean Decision Trees: Several Remarks
Assume we want to show that (a) the cost of any randomized decision tree computing a given Boolean function is at least c. To this end it suffices to prove that (b) there is a probability distribution over the set of all assignments to the variables of that function with respect to which the average cost of any deterministic decision tree computing that function is at least c. Yao [11] showed that this method is universal for proving lower bounds for randomized errorless decision trees, that is, that (a) is equivalent to (b). In the present paper we prove that this is also the case for randomized decision trees which are allowed to make errors. This gives a positive answer to the question posed in [11]. In the second part of the paper we exhibit an example in which randomized directional decision trees (defined in [7]) evaluating read-once formulae are not optimal. We construct a formula F_n of n Boolean variables such that the cost of the optimal directional decision tree computing F_n is …
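The easy direction (b) ⇒ (a) can be seen on a toy example (our own illustration, not from the paper): for the 2-variable AND, under the uniform distribution every deterministic zero-error decision tree has average cost 1.5, so every randomized zero-error tree has worst-case expected cost at least 1.5.

```python
from itertools import product

def tree_cost(order, x):
    """Number of queries made by the deterministic zero-error tree for AND
    that reads variables in the given order, stopping once the answer is known."""
    queries = 0
    for i in order:
        queries += 1
        if x[i] == 0:       # AND is 0 as soon as a 0 is seen
            return queries
    return queries          # all queried variables were 1

inputs = list(product([0, 1], repeat=2))            # uniform distribution on {0,1}^2
avg_costs = [sum(tree_cost(order, x) for x in inputs) / len(inputs)
             for order in [(0, 1), (1, 0)]]          # the only two query orders
lower_bound = min(avg_costs)                         # Yao: bound for randomized trees
print(avg_costs, lower_bound)
```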
On Algorithmic Rate-Distortion Function
We develop rate-distortion theory in the Kolmogorov-complexity setting. This is a theory of lossy compression of individual data objects, using the computable regularities of the data.
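The central object of such a theory can be written as follows (the standard Kolmogorov-complexity analogue of Shannon's rate-distortion function; our notation, since the abstract is truncated):

```latex
% Algorithmic rate-distortion function of an individual object x:
% the minimal complexity of a representation y within distortion d of x,
% for a given distortion measure d(\cdot,\cdot).
r_x(d) \;=\; \min_{y}\bigl\{\, K(y) \;:\; d(x,y) \le d \,\bigr\}
```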
Relationships between NP-sets, Co-NP-sets, and P-sets relative to random oracles
In the present paper we prove that relative to a random oracle A (with respect to the uniform measure) the following three assertions hold: (1) there is a pair of disjoint NP^A-sets which are separable by no P^A-set, (2) there is a pair of disjoint Co-NP^A-sets which are separable by no P^A-set, and (3) there is an infinite Co-NP^A-set having no infinite NP^A-subset.
1 Introduction
Many important problems in complexity theory remain open. The best-known one is whether the classes P and NP are equal. It is also unknown whether the class NP coincides with the class Co-NP and whether NP ∩ Co-NP = P. In the paper [1] it was shown that all these problems have no relativizable solutions. More exactly, oracles A and B were constructed such that P^A = NP^A (and, therefore, P^A = NP^A = NP^A ∩ Co-NP^A) and NP^B ≠ Co-NP^B (and, therefore, P^B ≠ Co-NP^B). Using the same technique one can construct an oracle C for which NP^C ∩ Co-NP^C ≠ P^C. As the rela…